Making Sense of the Legendre Transform 
The Legendre transform is an important tool in theoretical physics, playing a critical role in
classical mechanics, statistical mechanics, and thermodynamics. Yet, in typical undergraduate or
graduate courses, the motivation for the method and its elegance are often missing, unlike the
treatment usually given to Fourier transforms. We review and modify the presentation of
Legendre transforms in a way that explicates the formal mathematics, resulting in manifestly symmetric
equations, thereby clarifying the structure of the transform algebraically and geometrically.
Then we bring in the physics to motivate the transform as a way of choosing independent variables 
that are more easily controlled. We demonstrate how the Legendre transform arises naturally from 
statistical mechanics and show how the use of dimensionless thermodynamic potentials leads to 
more natural and symmetric relations. 
INTRODUCTION

The Legendre transform is commonly used in up- 
per division and graduate physics courses, especially 
in classical mechanics,[1] statistical mechanics, and
thermodynamics.[2, 3] Most physics majors are first exposed
to the Legendre transform in classical mechanics,
where it provides the connection between the Lagrangian 
$\mathcal{L}(\dot q)$ and the Hamiltonian $\mathcal{H}(p)$, and then in statistical
mechanics where it yields relations between the inter- 
nal energy E and the various thermodynamic potentials. 
Despite its common use, the Legendre transform often 
appears in an ad hoc fashion, without being presented as 
a general and powerful mathematical tool in the way the 
Fourier transform is. 

In this paper we present a pedagogical introduction 
to the Legendre transform, discuss it as a mathematical 
process, and display some of its general properties. Since 
some students prefer algebraic approaches and some pre- 
fer geometric ones, we discuss the transform from both 
points of view and relate them. We then motivate the 
transform in terms related to physical conditions and 
constraints. We emphasize some of the symmetries and 
structures of the transform and present a series of increas- 
ingly complex examples beginning with classical mechan- 
ics and going through examples in statistical mechanics. 
We end with some remarks on more general versions of 
the Legendre transform, as well as other areas in which 
it is widely used. 



THE LEGENDRE TRANSFORM AS AN 
ALTERNATIVE WAY TO DISPLAY 
INFORMATION 

In our experience, many students can manage the 
rules for generating a Hamiltonian from a Lagrangian or 
switching between thermodynamic potentials quite well,
but express discomfort when asked about the Legendre
transform as a general mathematical tool. One possi- 
ble reason is that in introductory physics we often treat 
a function as a relation between physical rather than 
mathematical quantities. Thus, when we are thinking 
about physical functions we tend not to pay attention to 
the particular functional form the mathematical function 
uses to encode physical information. Q For example, if we 
are describing a position as a function of time, we might 
write it as x(t). We do not bother to change the sym- 
bol x if we decide to give t in milliseconds instead of in 
seconds. If we write the temperature as a function of po- 
sition as T(r), we do not change the symbol if we switch 
to a different coordinate system or measuring scale. In 
contrast, the Legendre transform is explicitly about how 
information is coded in the functional form. 

In addition, students are usually first introduced to 
the Legendre transform as the transformation in classi- 
cal mechanics from the Lagrangian to the Hamiltonian. 
This transformation involves the switch from the velocity 
to the momentum variable in the non-relativistic kinetic 
energy. In the context of non-relativistic particle mo- 
tion with velocity independent potentials, the transform 
involves the kinetic energy, the most trivial function to 
which the Legendre transform can be applied. The result 
looks like a shift in units (from v to mv as an indepen- 
dent variable) so that it seems pointless. Because the 
position variable q plays no role in the transform and 
typically appears only in V, the result is often regarded 
as a mysterious change of the sign of V: $\mathcal{L} = T - V$ vs.
$\mathcal{H} = T + V$.

In the rest of this section we motivate the Legendre 
transform as a general mathematical transformation and 
describe a method that displays its general properties 
and symmetries. 

For clarity, we begin with a single variable x and con- 
sider multivariate functions later. Generally, a function 
expresses a relation between two parameters: an independent
variable or control parameter x and a dependent
value F. This information is encoded in the functional 
form of F(x). For later convenience, we will also denote 
such a relationship or "encoding" as {F, x}. 

In some circumstances it is useful to encode the infor- 
mation contained in a function F(x) in a different way. 
Two common examples are the Fourier transform and the 
Laplace transform. These transforms express the func- 
tion F as sums of (complex or real) exponentials, and 
display the information in F in terms of the amount of 
each component contained in the function rather than 
in terms of the value of the function. In the notation introduced
above, $\{\tilde F, k\}$ encodes the same information as
{F, x}. For the Fourier transform, $\tilde F(k) = \int e^{ikx} F(x)\,dx$
is an explicit "transformation" between the two encodings.

Given an F(x), the Legendre transform provides a 
more convenient way of encoding the information in the 
function when two conditions are met: (1) The function 
(or its negative) is strictly convex (second derivative al- 
ways positive) and smooth (existence of "enough" con- 
tinuous derivatives). (2) It is easier to measure, control, 
or think about the derivative of F with respect to x than 
it is to measure or think about x itself. 

Because of condition (1), the derivative of F(x) with 
respect to x can serve as a stand in for x; that is, there 
is a one-to-one mapping between x and dF/dx. (We re- 
mark on relaxing this condition in the last section.) The 
Legendre transform shows how to create a function that 
contains the same information as F(x) but as a function 
of dF/dx. 

THE MATHEMATICS OF THE LEGENDRE 
TRANSFORM 

We first consider a single, smooth convex function of 
a single variable. There are many equivalent ways to 
characterize convex functions. The most convenient one 
is that the second derivative $d^2F(x)/dx^2$ is always positive.
A second characterization of our condition is that
the slope function

$$ s(x) = \frac{dF(x)}{dx} \tag{1} $$

is a strictly monotonic function of x (since this also per- 
mits us to treat functions whose negative is convex). 

Figure 1 shows graphically how the value of x and the
slope of a convex function can stand in for each other;
the curve drawn there to represent F is convex. As we move
along the curve to the right (as x increases), the slope of
the tangent to the curve continually increases. In other
words, if we were to graph the slope as a function of
x, it would be a smoothly increasing curve, such as the
example in Fig. 2. If the second derivative $d^2F/dx^2$ exists
(everywhere within the range of x in which F is defined;
part of the condition for a smooth F), there is a unique
value of the slope for each value of x, and vice versa. The
corresponding mathematical language is that there is a
one-to-one relation between s and x; that is, the function s(x) is
single-valued and can be inverted to give a single-valued
function x(s).

FIG. 1: The graph (blue online) of a convex function F(x).
The tangent line at one point is illustrated (red online).

FIG. 2: The graph of s(x), the slope of a convex function.

In this way, we could then start with s as the independent
variable, use the inverse function to get a unique
value of x, and then insert that into F(x) to access F 
as a function of s. The standard notation for such a 
function is F(x(s)). If we insist on a new encoding of 
the information in F (in terms of s instead of x), this 
straightforward "function of a function approach" would 
appear to be the most natural way. 

Instead, the Legendre transform of F(x) is defined
quite differently, and seemingly quite unnaturally:

G(s) = sx(s) - F(x(s)) . (2) 
Typically, this formula is presented with little motivation
or explanation, and leaves the students to ponder: Why? 
Why the extra sx? Why the minus sign? Frequently, the 
instructor or the author (of a textbook) invokes another 
magical relation to answer such queries. Only with this 
peculiar definition can we have the property that "the 
slope of G(s) is just x" : 



$$ x(s) = \frac{dG}{ds} . \tag{3} $$

This result also requires a careful calculation.
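The definition in Eq. (2) and the slope property in Eq. (3) can be checked numerically. The following is a minimal sketch, not part of the original presentation: the helper names `invert_slope` and `legendre`, the bisection interval, and the test function F(x) = exp(x) are all illustrative assumptions. For this F, the slope is s = exp(x), so x(s) = ln s and G(s) = s ln s - s.

```python
import math

def invert_slope(dF, s, lo, hi, tol=1e-12):
    """Solve dF(x) = s for x by bisection; dF must be increasing on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dF(mid) < s:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def legendre(F, dF, lo, hi):
    """Return G with G(s) = s*x(s) - F(x(s)), where x(s) solves dF(x) = s."""
    def G(s):
        x = invert_slope(dF, s, lo, hi)
        return s * x - F(x)
    return G

# Illustrative convex function (an assumption): F(x) = exp(x), so dF/dx = exp(x).
G = legendre(math.exp, math.exp, -20.0, 20.0)

s, h = 2.0, 1e-5
closed_form = s * math.log(s) - s               # G(s) = s ln s - s for this F
slope_of_G = (G(s + h) - G(s - h)) / (2 * h)    # central difference for dG/ds

print(G(s), closed_form)         # the two values should agree closely
print(slope_of_G, math.log(s))   # dG/ds should reproduce x(s) = ln s
```

The bisection step relies on the strict monotonicity of the slope, which is exactly what convexity guarantees.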



A graphic-geometric approach 

Before providing ways to appreciate this definition of 
the Legendre transform, as well as how never to forget 
"which sign goes where," we present a graphical route to 
the transform. Consider the plot of F versus x in Fig. 3. 
Choose a value of x, which is represented by the length 
of the horizontal line labeled by x. Go up to the value 
on the function curve, F(x). This value corresponds to 
the length of the vertical line labeled by F. Next, draw 
the tangent to the curve at that point. The slope here 
is labeled s, as emphasized by the call out bubble. Extend
this tangent until it hits the ordinate (the "F axis").
In this example, the intercept is negative and is labeled
as -G, with a positive G. This value corresponds to the
length of the thick vertical line labeled by G. This length
is reproduced (thin line) just below the line labeled F.
Because the slope of the tangent is s, the length of the
dotted vertical line is sx. From this picture, it is quite
clear that sx = F + G. In this light, the peculiar definition
of the Legendre transform in Eq. (2) appears natural.
The minus sign in the definition is seen as a way of re- 
taining the symmetry and simplicity of the geometrical 
statement: "In the triangle, the slope (tangent) times the 
adjacent side equals the opposite side, which is the sum 
of F and G." 




FIG. 3: Graphic representation of the Legendre transform,
G(s), of F(x). See text for an explanation of various quantities
(color online).

Symmetric representation of the Legendre transform

This symmetric, geometrical construction allows us to
display a number of useful and elegant relations that shed
light on the workings of the Legendre transform. In particular,
we consider the symmetries associated with the
inverse Legendre transform, extreme values, and derivative
relations.

Ordinarily, the inverse of a transformation is distinct
from the transform itself. For example, an inverse
Laplace transform is not given by the same formula. The
Legendre transform distinguishes itself in that it is its
own inverse. In this sense, it resembles (geometric) duality
transformations. Symbolically, we may denote this
relationship as:

$$ \{F, x\} \Longleftrightarrow \{G, s\} . \tag{4} $$

Specifically, if we perform the Legendre transform a second
time, we recover the original function. (If the restriction
of convexity is relaxed, this statement must be
revised, as remarked in the final section.) In other words,
suppose we start with the function G(s) and calculate its
Legendre transform. Of course, as we will see, G(s) satisfies
our conditions: convex and smooth. So, we start
with

$$ y(s) = \frac{dG}{ds} \tag{5} $$

and invert the monotonic function y(s) to s(y). Next, we
construct

$$ H(y) = y\,s(y) - G(s(y)) , \tag{6} $$

which can be rewritten as

G = sy - H . (7)

If we compare this equation with Eq. (2), and Eq. (5) with Eq. (3), we see that
we can identify {H, y} with {F, x}. Thus, the Legendre
transform of G is the original function F, leading to the
statement: the Legendre transform is its own inverse.
This "duality" of the Legendre transform, shown symbolically
in Eq. (4), is best displayed by the symmetric
form

G(s) + F(x) = sx . (8) 

This equation should be read carefully. Despite its ap- 
pearance, there is only one independent variable: ei- 
ther s or x. Referred to as a conjugate pair, these 
two variables are related to each other, through either
x(s) = dG(s)/ds or s(x) = dF(x)/dx. A careful writing
of Eq. (8) would read either G(s) + F(x(s)) = sx(s) or
G(s(x)) + F(x) = s(x)x. To double check the consistency
with Eqs. (1) and (3), we can start with, say, the first of
these equations and differentiate with respect to s. Applying
the chain rule, dF/ds = (dF/dx)(dx/ds), we
recover dG/ds = x.
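The statement that the Legendre transform is its own inverse can also be tested numerically. The sketch below is illustrative only: the helper names, the bisection intervals, and the choice F(x) = cosh(x) are assumptions made for the example. It transforms {F, x} to {G, s}, transforms again to {H, y}, and compares H with F.

```python
import math

def bisect(g, target, lo, hi):
    """Solve g(u) = target on [lo, hi] for an increasing function g."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def legendre(F, dF, lo, hi):
    """Return (G, dG) with G(s) = s*x - F(x), dF(x) = s, and dG(s) = x(s)."""
    def x_of(s):
        return bisect(dF, s, lo, hi)
    def G(s):
        x = x_of(s)
        return s * x - F(x)
    return G, x_of   # by Eq. (3), the slope of G is x(s)

# Illustrative choice (an assumption): F(x) = cosh(x), with slope sinh(x).
F, dF = math.cosh, math.sinh
G, dG = legendre(F, dF, -10.0, 10.0)     # {F, x} -> {G, s}
H, dH = legendre(G, dG, -100.0, 100.0)   # {G, s} -> {H, y}

for x in (-1.5, 0.0, 0.7, 2.0):
    print(x, F(x), H(x))   # H(x) should reproduce F(x)
```

Returning the inverse slope map alongside G avoids differentiating G numerically before the second transform.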



Properties of the extrema 

The example in Fig. 3 shows a convex function F(x)
with a unique minimum. Let us denote this point by
$F_{\min} = F(x_{\min})$. The slope of the tangent vanishes here,
that is, $s(x_{\min}) = 0$. If we substitute this point into
Eq. (2), we find that the minimum value of F is

$$ F_{\min} = -G(0) . \tag{9} $$



It is straightforward to show that a "dual" relation exists,
namely, the minimum value of G is $G_{\min} = -F(0)$.
To appreciate the geometric meaning of this equation,
we need only to inspect Fig. 3 and see that -G, the y-intercept
of the tangent to the curve F(x), never reaches
beyond F(0).

Exploiting Eq. (8), both this special example and
the case of general extrema can be cast in an "easy-to-remember"
symmetric form. Suppose F takes on its extremal
value at $x_{\rm ext}$; then we have a horizontal tangent
line and, by definition, $s(x_{\rm ext}) = 0$. Similarly, if G is at
its extremum at $s_{\rm ext}$, we have $x(s_{\rm ext}) = 0$ due to Eq. (3).
In either case, the right hand side of Eq. (8) vanishes
and we have

$$ G(0) + F(x_{\rm ext}) = 0 \quad\text{and}\quad G(s_{\rm ext}) + F(0) = 0 . \tag{10} $$
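As a concrete check of the extremum relations in Eq. (10), consider a shifted parabola. The function below is an illustrative choice, not one taken from the text; any strictly convex F with a minimum would serve.

```latex
% Illustrative example (an assumption): F(x) = (x-1)^2 + 2
\begin{align*}
s(x) &= \frac{dF}{dx} = 2(x-1)
  \quad\Longrightarrow\quad x(s) = 1 + \frac{s}{2}, \\
G(s) &= s\,x(s) - F(x(s))
  = s + \frac{s^2}{2} - \frac{s^2}{4} - 2
  = \frac{s^2}{4} + s - 2, \\
F_{\min} &= F(1) = 2 = -G(0), \\
G_{\min} &= G(-2) = -3 = -F(0).
\end{align*}
```

The minimum of G sits at s = -2, which is the slope of F at x = 0, so the two extremum statements are mirror images of each other.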



Symmetric representation of the higher derivatives 

Since the Legendre transform is a "dual" relationship, 
we can expect manifestly symmetric relations beyond the 
ones we have seen so far: 



$$ G(s) + F(x) = sx \tag{11} $$

and

$$ \frac{dG}{ds} = x \quad\text{and}\quad \frac{dF}{dx} = s . \tag{12} $$



From these, we can obtain an infinite set of relations 
linking G and F by taking derivatives of G + F = sx 
with respect to s or x. Because each function depends on 
only one variable, the differentials can be easily identified. 
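Thus, differentiating the equations in (12) with respect
to s or x as appropriate, we find

$$ \frac{d^2G}{ds^2} = \frac{dx}{ds} \quad\text{and}\quad \frac{d^2F}{dx^2} = \frac{ds}{dx} . \tag{13} $$

But (dx/ds)(ds/dx) = 1, so we have

$$ \left(\frac{d^2G}{ds^2}\right)\left(\frac{d^2F}{dx^2}\right) = 1 . \tag{14} $$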



Let us emphasize once more that the variable s in the
first factor and the x in the second are not independent,
but linked through Eqs. (12)!

Equation (14) illustrates the importance of (strict)
convexity so that neither derivative ever vanishes. An
interesting result is that the local curvatures of the Legendre
transforms are inverses of each other in a manner
reminiscent of the uncertainty relation $\Delta x\,\Delta k \approx 1$. For
simplicity, suppose F is dimensionless but x is not,[6] so
that s has the dimension of 1/x. With this convention it
is easy to check the units of Eqs. (11), (12), and (14).

If we differentiate Eq. (14) again, we can write a symmetric
relation for the third derivative:

$$ \frac{d^3G}{ds^3}\left[\frac{d^2G}{ds^2}\right]^{-3/2} + \frac{d^3F}{dx^3}\left[\frac{d^2F}{dx^2}\right]^{-3/2} = 0 . \tag{15} $$



Notice that each term is again dimensionless, since the 
units of the various derivatives precisely cancel. 
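The step from Eq. (14) to the third-derivative relation in Eq. (15) can be spelled out as follows. This is a sketch of one possible route, assuming a strictly convex F so that the fractional powers are well defined; it uses only Eqs. (13) and (14).

```latex
\begin{align*}
\frac{d^2G}{ds^2} &= \left(\frac{d^2F}{dx^2}\right)^{-1}
  && \text{[Eq.~(14)]} \\
\frac{d^3G}{ds^3}
  &= -\left(\frac{d^2F}{dx^2}\right)^{-2}\frac{d^3F}{dx^3}\,\frac{dx}{ds}
   = -\left(\frac{d^2F}{dx^2}\right)^{-3}\frac{d^3F}{dx^3}
  && \text{[chain rule, } dx/ds = (d^2F/dx^2)^{-1}\text{]} \\
\frac{d^3G}{ds^3}\left(\frac{d^2G}{ds^2}\right)^{-3/2}
  &= -\frac{d^3F}{dx^3}\left(\frac{d^2F}{dx^2}\right)^{-3/2}
\end{align*}
```

Moving the right-hand side across gives a sum of two terms of identical form, one in G and s, the other in F and x.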

It is possible to derive an infinite set of such relations 
for higher derivatives by differentiating further. Such an 
exercise also shows that if F is smooth (with a well-defined
nth derivative), then so is G. The relations for
higher derivatives do not have forms as simple as Eqs.
(11), (12), (14), and (15), but become more and more
complex.



EXAMPLES OF THE LEGENDRE TRANSFORM 
IN SINGLE-PARTICLE MECHANICS 

It is useful to provide some physical examples to illustrate
these relations. The simplest is a quadratic function
$F(x) = ax^2/2$. For this function we easily find that
s = ax and x = s/a, leading to $G(s) = s^2/2a$. The curvatures
in F and G (a and 1/a, respectively) are inverses
of each other as required by Eq. (14). All derivative relations
beyond this level are trivial, i.e., 0 = 0.

This example corresponds to a single non-relativistic 
particle with mass m moving in an external potential 
V(q). The Legendre transform connects the Lagrangian
$\mathcal{L}(\dot q)$ to the Hamiltonian $\mathcal{H}(p)$. Only the kinetic term,
which depends on $\dot q$ or p, is affected by the transform, as
the potential depends on an entirely different variable: q.
There, $x \to \dot q$, $F \to \mathcal{L}$, $a \to m$, $s \to p$, and $G \to \mathcal{H}$, so
that $\mathcal{L} = m\dot q^2/2 \Leftrightarrow \mathcal{H} = p^2/2m$. However, since V(q)
is just a "spectator" in the Legendre transform, it must
appear with opposite signs in F and G (i.e., $\mathcal{L}$ and $\mathcal{H}$),
in order to satisfy F + G = sx (i.e., $\mathcal{L} + \mathcal{H} = p\dot q$, with no
q anywhere). Thus, we see the origin of the mysterious
sign change in V when we go from the Hamiltonian to
the Lagrangian.
Relativistic kinetic energy is a more interesting case.
Here, we go the other way and start with momentum
and generate a velocity as the slope of the function. The
relativistic kinetic energy as a function of momentum is
(with c = 1) $\mathcal{H}(p) = \sqrt{p^2 + m^2}$. $\mathcal{H}(p)$ is convex and its
slope at a point p is

$$ v = \frac{d\mathcal{H}}{dp} = \frac{p}{\sqrt{p^2 + m^2}} , \tag{16} $$

giving the familiar result

$$ p = \frac{mv}{\sqrt{1 - v^2}} . \tag{17} $$



Creating the Legendre transform using this pair of vari- 
ables leads to the LagrangianQ 



$$ \mathcal{L}(v) = pv - \mathcal{H}(p(v)) = -m\sqrt{1 - v^2} . \tag{18} $$



This example can also be written in terms of the function 
F(x) = cosh Ax. The demonstration is left to the reader.
(Hint: see Ref. &) 
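The relativistic example can be verified numerically in a few lines. This is a minimal sketch under stated assumptions: units with c = 1, an illustrative mass m = 1, and function names chosen here for readability. It builds the Legendre transform of H(p) = sqrt(p^2 + m^2) and compares it with the closed form in Eq. (18).

```python
import math

m = 1.0  # illustrative mass (an assumption), in units with c = 1

def H(p):
    """Relativistic energy as a function of momentum."""
    return math.sqrt(p * p + m * m)

def velocity(p):
    """Slope of H: v = dH/dp = p / sqrt(p^2 + m^2)."""
    return p / math.sqrt(p * p + m * m)

def momentum(v):
    """Inverse of the slope relation, Eq. (17): p = m v / sqrt(1 - v^2)."""
    return m * v / math.sqrt(1.0 - v * v)

def lagrangian(v):
    """Legendre transform L(v) = p v - H(p(v))."""
    p = momentum(v)
    return p * v - H(p)

for v in (0.0, 0.3, 0.6, 0.9):
    expected = -m * math.sqrt(1.0 - v * v)   # closed form in Eq. (18)
    print(v, lagrangian(v), expected, velocity(momentum(v)))
```

The last printed column confirms that `velocity` and `momentum` are inverses of each other, which is the one-to-one mapping that convexity of H(p) provides.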

Let us turn to a less familiar example, one that is so 
trivial that it does not appear in typical textbooks. Yet 
it sets the stage for examining the role of the Legendre 
transform in equilibrium statistical mechanics. 

Consider a particle in a one-dimensional convex potential
well, U(x), which has a unique minimum at $x_{\min}$.
An example would be a particle attached to a wall by
a non-ideal spring, with x being the distance from the
point where the coils of the spring are fully compressed.
The potential is effectively infinite at x = 0, decreases to
a minimum at its natural extension, and then increases
for larger x. (We restrict our attention to positive values
of x, but less than the breaking point of the spring.) Another
example of U is the potential that binds two atoms
into a molecule (though such U's are rarely convex for
all separations).

The particle is stationary only if it is at $x_{\min}$ for all
time. If it is subjected to an additional external applied
force f, then it will reach a new stationary point $x_0$,
which is the solution to the equation

$$ \left. \frac{dU}{dx} \right|_{x_0} = f . \tag{19} $$

To emphasize the dependence of this point on f, we write
$x_0(f)$. We can ask the inverse question: If we want the
particle to settle at $x_1 \neq x_{\min}$, what force do we need
to apply? The answer is $f(x_1)$, a force that depends
on which x we choose. A little thought leads us to the
explicit functional form: $f(x_1) = dU/dx|_{x_1}$. There is
nothing special about the subscripts here and we may
just as well write

$$ f(x) = \frac{dU}{dx} \tag{20} $$

and x(f) instead of $x_0(f)$.

Although Eq. (20) gives f(x) explicitly, we may ask
if there is a counterpart to U which provides the inverse,
x(f), explicitly. If so, we can simply plug f into
the expression and arrive at the new equilibrium position.
The answer is the Legendre transform of U, namely,
$V(f) = f\,x(f) - U(x(f))$. We leave it to the reader to show
that

$$ x(f) = \frac{dV}{df} \tag{21} $$

is the companion to Eq. (20).

All the details can be worked out for the simple example
of the mass on a spring, $U(x) = kx^2/2$. This example
is the analog of the non-relativistic kinetic energy Legendre
transform. The reader can easily demonstrate that
the Legendre transform equation U + V = fx becomes
$(f - kx)^2 = 0$, yielding the relation between f and the
new equilibrium point x.
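For readers who want the intermediate steps of the spring example, one way to organize the algebra is sketched below. It uses only the definitions already given (the force as the slope of U, and V as the Legendre transform of U).

```latex
\begin{align*}
U(x) &= \tfrac{1}{2}kx^2, \qquad f(x) = \frac{dU}{dx} = kx
  \quad\Longrightarrow\quad x(f) = \frac{f}{k}, \\
V(f) &= f\,x(f) - U(x(f)) = \frac{f^2}{k} - \frac{f^2}{2k} = \frac{f^2}{2k},
  \qquad \frac{dV}{df} = \frac{f}{k} = x(f), \\
U + V - fx &= \tfrac{1}{2}kx^2 + \frac{f^2}{2k} - fx
  = \frac{(f - kx)^2}{2k} = 0 .
\end{align*}
```

The last line shows that the symmetric relation U + V = fx holds only on the curve f = kx, that is, only when f and x are a conjugate pair.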

Note that the information about the system (for example,
wall-spring-particle complex) is fully contained in either
U or V. The only difference is in the coding: {U, x}
vs. {V, f}. Although U is the usual potential energy
associated with putting the particle at x, V is a kind
of potential associated with the control f. In ordinary
classical mechanics, such an approach seems unnecessar- 
ily cumbersome for describing the simple problems we 
posed. Thus, it is rightfully ignored in a course on clas- 
sical mechanics. We include the example here only as a 
stepping stone to the Legendre transform in statistical 
mechanics and thermodynamics. There, multiple poten- 
tials are essential. 



THE LEGENDRE TRANSFORM IN 
STATISTICAL THERMODYNAMICS 



The Legendre transform appears frequently in sta- 
tistical thermodynamics when different variables are 
"traded" for their conjugates. 0] Often, one of the vari- 
ables is easy to think about while the other is easy to 
control in physical situations. 

The difficulty with making sense of the Legendre trans- 
form in thermodynamics arises from two causes: (1) 
For historical reasons, Legendre transform variables are 
not always chosen as conjugate pairs. (2) Many vari- 
ables in thermodynamics are not independent and are 
constrained by equations of state, for example, $PV = Nk_BT$.

As an example of the first point, the conjugate to the 
total energy E of a system is the inverse temperature 
($\beta = 1/k_BT$). Yet, our daily experience with the temperature
T is so pervasive that T is used in most of the
relations. Thus, the familiar equation

$$ F = E - TS , \tag{22} $$
which relates the Helmholtz free energy F to the entropy
S, obscures the symmetry between $\beta$ and E, as well as
the dimensionless nature of the Legendre transform. In
contrast, if we define the dimensionless quantities

$$ \mathcal{S} = S/k_B \quad\text{and}\quad \mathcal{F} = \beta F , \tag{23} $$

the "duality" between them can be beautifully expressed:

$$ \mathcal{F}(\beta) + \mathcal{S}(E) = \beta E . \tag{24} $$



To elaborate the second point, we typically encounter 
a bewildering array of thermodynamic functions (for ex- 
ample, entropy, Gibbs and Helmholtz free energies, and 
enthalpy), a slew of variables (energy, temperature, vol- 
ume, and pressure), as well as a jumble of thermodynamic 
relations (with multiple partial derivatives). In general, 
because of the multiple constrained variables, none of 
these examples is as simple as those we have considered, 
compounding the difficulty of both teaching and learning 
this material. 

Before discussing the generation of the standard ther- 
modynamic potentials, we briefly summarize the basics 
of statistical mechanics. We will show how the Legendre 
transform enters thermodynamics through the Laplace 
transform of partition functions in statistical mechanics. 

Equilibrium statistical mechanics is based on the 
hypothesis that for an isolated system, every allowed 
microstate is equally probable. The high probability of 
finding a particular equilibrium macrostate is due to a 
predominance of the number of microstates correspond- 
ing to that macrostate. The classic example is a gas of 
N identical, free, non-relativistic structureless particles, 
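confined in a D-dimensional box of volume $L^D$. For this
system a microstate is specified by the 2DN variables
corresponding to the positions and momenta of each particle,
$\{\vec r_i, \vec p_i\}$, with i = 1, . . . , N. Because the total energy
E is a constant for an isolated system, the fundamental
hypothesis can be represented as

$$ P(\{\vec r_i, \vec p_i\}) \propto \delta\bigl(E - \mathcal{H}(\{\vec r_i, \vec p_i\})\bigr) , \tag{25} $$

where $P(\{\vec r_i, \vec p_i\})$ is the probability of finding the configuration
of positions and momenta $\{\vec r_i, \vec p_i\}$ and $\mathcal{H}$ is the
Hamiltonian. In this case $\mathcal{H}$ is explicitly given by

$$ \mathcal{H} = \sum_{i=1}^{N} \left[ \frac{\vec p_i^{\,2}}{2m} + U(\vec r_i) \right] , \tag{26} $$

where m is the mass of each particle and U is the confining
potential, which is zero for each component of
$\vec r \in [0, L]$ and infinite otherwise.
The normalization factor for P is

$$ \Omega(E) = \int_{\vec r, \vec p} \delta\bigl(E - \mathcal{H}(\{\vec r_i, \vec p_i\})\bigr) , \tag{27} $$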



where the integral is over all $\{\vec r_i, \vec p_i\}$ from $-\infty$ to $\infty$. (The
infinite values of U restrict the actual position integrations
to the volume of the box.) We have also suppressed
the other variables that $\Omega$ depends on for now: L and m.
Note that $\Omega$ is just the volume of phase space available
for our system and is also known as the microcanonical
partition function.

The standard approach evaluates the integral in Eq.
(27) as follows. The position integrals can be done explicitly
because the only dependence of the Hamiltonian
on position is the confinement of the position integrals
to the allowed volume. These integrals yield a factor of
$L^{ND}$. The momentum integrals are done by computing
the surface area of a sphere in DN dimensions.

The entropy is introduced by the definition $S = k_B \ln\Omega$.
We exploit the "dimensionless entropy" $\mathcal{S}$ and
write

$$ \mathcal{S}(E) = \ln\Omega(E) . \tag{28} $$



To proceed, we have two choices: the route that empha- 
sizes the mathematics or the physics. 



The route of mathematics 

Our task is straightforward: evaluate integrals with a 
constraint such as Eq. (27). Often, such integrals are not
easy to perform. However, exploiting the Laplace transform
typically renders the integrand factorizable. For example,
the DN integrations in Eq. (27) become products
of a single integral. Specifically, we consider the Laplace
transform of $\Omega(E)$,

$$ Z(\beta) = \int \Omega(E)\, e^{-\beta E}\, dE . \tag{29} $$



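If we substitute Eq. (27) for $\Omega(E)$, the delta function
permits us to do the E integral, giving

$$ Z(\beta) = \int_{\vec r, \vec p} e^{-\beta\mathcal{H}} . \tag{30} $$

Because $\mathcal{H}$ is a sum over the individual components,
the integrand factorizes and we have the result:

$$ \int_{\vec r, \vec p} e^{-\beta\mathcal{H}} = \int_{\vec r, \vec p} \prod_{i} e^{-\beta h(\vec r_i, \vec p_i)} = \left[ \int d\vec r\, d\vec p\; e^{-\beta h(\vec r, \vec p)} \right]^N . \tag{31} $$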

Being an integral in just 2D dimensions, the expression
in [...] is much easier to handle. For the classic example
above, the integral is simply $L^D (2\pi m/\beta)^{D/2}$. The
attentive student will have noticed, from Eq. (30), that
Z is the canonical partition function and can appreciate
the statement: The two partition functions are related
to each other through a Laplace transform.
To return to our goal, $\Omega(E)$, we need to perform an
inverse Laplace transform, that is,

$$ \Omega(E) = \int_C Z(\beta)\, e^{\beta E}\, d\beta , \tag{32} $$

where C is a contour in the complex $\beta$ plane (running
parallel to and to the right of the imaginary axis). We
define

$$ \mathcal{F}(\beta) = -\ln Z(\beta) \tag{33} $$

and write the integral as

$$ e^{\mathcal{S}(E)} = \int_C e^{-\mathcal{F}(\beta) + \beta E}\, d\beta . \tag{34} $$



To continue it is necessary to inject some physics. In this
case, we expect to be considering many particles, that is,
large N. From Eq. (31), we have $\mathcal{F} \propto N$, leading us to
expect that the range of E of interest is also O(N). The
standard tool to evaluate integrals with large exponentials
as integrands is the saddle point (or steepest descent)
method. Thus, we seek the saddle point in $\beta$, defined by
setting the first derivative of $\beta E - \mathcal{F}(\beta)$ to zero:

$$ \left. \frac{d[\beta E - \mathcal{F}]}{d\beta} \right|_{\beta_0} = 0 . \tag{35} $$

In other words, we have

$$ \left. \frac{d\mathcal{F}}{d\beta} \right|_{\beta_0} = E . \tag{36} $$

We emphasize that $\beta_0$ should be regarded as a function
of E here.

In this approach, the integral in Eq. (34) is well approximated
by evaluating the integrand at the saddle point,
so that

$$ \Omega(E) = \exp[\beta_0 E - \mathcal{F}(\beta_0)] \tag{37} $$

or, using Eq. (28),

$$ \mathcal{S}(E) + \mathcal{F}(\beta_0) = \beta_0 E , \tag{38} $$

with the understanding that $\beta_0$ and E are related
through Eq. (36). There is nothing significant about
the subscript on $\beta$, and Eq. (38) is identical to Eq. (24).
In other words, $\mathcal{S}$ and $\mathcal{F}$ are Legendre transforms of
each other. Thus, we see that (for situations involving
a large parameter, N in this case) the Laplace and Legendre
transforms, Eqs. (29) and (38) respectively, are
intimately related to each other as a result of the thermodynamic
limit.
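The claim that the Legendre transform reproduces the entropy in the thermodynamic limit can be illustrated for the ideal gas. The sketch below rests on several assumptions not fixed by the text: D = 3, L = m = 1, an energy E = 1.5 N, and the standard result that the inverse Laplace transform of beta^(-a) is E^(a-1)/Gamma(a), including the normalization that the contour integral leaves implicit. It compares the exact ln Omega(E) with the saddle-point value beta0 E - F(beta0).

```python
import math

def exact_entropy(E, N, D=3, L=1.0, m=1.0):
    """ln Omega(E) from the exact inverse Laplace transform of Z(beta).

    Assumes Z(beta) = [L^D (2 pi m / beta)^(D/2)]^N, so that
    Omega(E) = L^(N D) (2 pi m)^a E^(a-1) / Gamma(a) with a = D N / 2.
    """
    a = 0.5 * D * N
    return (N * D * math.log(L) + a * math.log(2 * math.pi * m)
            + (a - 1) * math.log(E) - math.lgamma(a))

def legendre_entropy(E, N, D=3, L=1.0, m=1.0):
    """S(E) = beta0 * E - F(beta0), with F = -ln Z and dF/dbeta = E at beta0."""
    a = 0.5 * D * N
    beta0 = a / E   # saddle point: dF/dbeta = a / beta = E
    F = -N * (D * math.log(L) + 0.5 * D * math.log(2 * math.pi * m / beta0))
    return beta0 * E - F

for N in (10, 100, 10_000):
    E = 1.5 * N   # energy proportional to N (beta0 = 1 for D = 3)
    gap = (legendre_entropy(E, N) - exact_entropy(E, N)) / N
    print(N, gap)   # entropy difference per particle; expected to shrink with N
```

The total difference grows only logarithmically with N, so the difference per particle vanishes as N becomes large, which is the sense in which the two transforms agree.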



The route of physics: interpretation of the 
equilibrium condition 

Under what conditions does the internal energy move 
from one object to another and under what conditions 



can it be changed to work? Part of the answer lies in 
understanding which way the energy will move if we bring 
two different systems into thermal contact. Why does it 
not go always from the system with more energy to the 
one with less? Considering this question leads us back to 
the Legendre transform. 

When two systems (not necessarily of the same size
or energy) are brought together and the combined system
isolated, $E_{\rm tot} = E_1 + E_2$ will remain a constant
and can be regarded as the control parameter. The individual
$E_i$'s are not fixed, and we ask the question:
Starting at some initial values, how do they wind up at
the final equilibrium partition $\{E_1^*, E_2^*\}$? The answer
lies with $\mathcal{S}_{\rm tot}(E_{\rm tot}|E_1, E_2)$, the entropy of the combined
system subjected to the specific partition of $E_{\rm tot}$ into
$\{E_1, E_2\}$. The idea is that $e^{\mathcal{S}_{\rm tot}}$ counts the number of allowed
microstates associated with a particular partition and so
carries the information of how probable that partition is.
In general, calculating this quantity is not trivial. However,
if we focus on systems with extensive entropies, then
we may write to a good approximation $\mathcal{S}_{\rm tot} = \mathcal{S}_1 + \mathcal{S}_2$
with $\mathcal{S}_1 = \mathcal{S}_1(E_1)$ and $\mathcal{S}_2 = \mathcal{S}_2(E_2)$. These statements
are not trivial: We are injecting the physics that, under
the conditions specified, the entropies of each system do
not depend on the energy of the other.

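Given these assumptions we can ask for what partition
$\mathcal{S}_{\rm tot}$ will be maximum, or equivalently, which partition is
the most probable. If we write $E_2 = E_{\rm tot} - E_1$ and recall
that $E_{\rm tot}$ is fixed, this task is easy. The maximum occurs
at $E_1^*$, defined by

$$ \left. \frac{d\mathcal{S}_{\rm tot}}{dE_1} \right|_{E_1^*} = 0 . \tag{39} $$

But $dE_1 = -dE_2$, so that we have

$$ \left. \frac{d\mathcal{S}_1}{dE_1} \right|_{E_1^*} = \left. \frac{d\mathcal{S}_2}{dE_2} \right|_{E_2^*} , \tag{40} $$

where $E_2^* = E_{\rm tot} - E_1^*$. This result is significant because
each side does not depend on the parameters of the other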
system. Thus, if we associate a quantity with $d\mathcal{S}/dE$,
which we define by

$$ \beta(E) = \frac{d\mathcal{S}}{dE} , \tag{41} $$

then Eq. (40) becomes

$$ \beta_1(E_1^*) = \beta_2(E_2^*) . \tag{42} $$



In other words, the most probable partition occurs when
the $\beta$ of one system equals the $\beta$ of the other. Note that
this condition does not depend on the details of the two
systems, such as composition, size, or state (gas, liquid,
solid, etc.). When the two systems are brought into contact,
energy will flow between them until they settle at
values given by this condition: the equality of a quantity,
$\beta = d\mathcal{S}/dE$, associated with each of them separately. It
is natural, therefore, to use this quantity for describing
our daily experience, namely, two systems, one hot and
one cold, will equilibrate at a common "temperature" (T)
when brought in contact with each other. Historically,
many arbitrary scales were used for T. Their relationships
with the more natural quantity $\beta$ were not clarified
until later.
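The equilibrium condition can be illustrated with two model systems. In the sketch below, the entropy S(E) = a ln E is an illustrative ideal-gas-like form (with a playing the role of DN/2), and the sizes a1, a2 and total energy are hypothetical values; none of these numbers come from the text. The code maximizes the total entropy over the partition of energy and then compares the two values of beta.

```python
import math

def entropy(E, a):
    """Illustrative dimensionless entropy S(E) = a ln E (an assumed model form)."""
    return a * math.log(E)

def beta(E, a):
    """beta(E) = dS/dE for the entropy above."""
    return a / E

def most_probable_split(E_tot, a1, a2, steps=200):
    """Maximize S1(E1) + S2(E_tot - E1) over E1 by ternary search (S_tot is concave)."""
    def total(E1):
        return entropy(E1, a1) + entropy(E_tot - E1, a2)
    lo, hi = 1e-9 * E_tot, (1.0 - 1e-9) * E_tot
    for _ in range(steps):
        left = lo + (hi - lo) / 3.0
        right = hi - (hi - lo) / 3.0
        if total(left) < total(right):
            lo = left
        else:
            hi = right
    return 0.5 * (lo + hi)

E_tot, a1, a2 = 10.0, 30.0, 90.0   # hypothetical systems of different size
E1 = most_probable_split(E_tot, a1, a2)
E2 = E_tot - E1
print(E1, E2)                       # the energies themselves are not equal
print(beta(E1, a1), beta(E2, a2))   # but the two betas should agree
```

The larger system ends up with more energy, yet it is beta, not the energy, that is shared at equilibrium.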

Besides providing a natural scale to describe "hot"
and "cold," can this variable $\beta$ be exploited further?
For a given system, we can write $\mathcal{S}(E(\beta))$, but is that
useful? The answer is connected to the canonical ensemble,
the (Helmholtz) free energy, and the Legendre
transform of $\mathcal{S}$. There is no need to reproduce here the
standard derivation of this ensemble and the Boltzmann
factor $e^{-\beta\mathcal{H}}$. In the previous subsection, we have already
discussed the transformation between the partition functions
$Z(\beta)$ and $\Omega(E)$ and the relationship to the Legendre
transform between $\mathcal{S}(E)$ and $\mathcal{F}(\beta)$.



How does the Legendre transform enter into 
thermodynamics? 

For convenience we summarize the key relations using dimensionless
potentials:

$$\Omega(E) = e^{S(E)}, \qquad Z(\beta) = e^{-\mathcal{F}(\beta)} \qquad (43)$$

$$\frac{dS}{dE} = \beta, \qquad \frac{d\mathcal{F}}{d\beta} = E \qquad (44)$$

where $Z(\beta) = \int dE\, e^{-\beta E}\, \Omega(E)$ and, in the
thermodynamic limit, $\mathcal{F}(\beta) + S(E) = \beta E$. We can now
see where the Legendre transform enters and why it is useful. The entropy
$S$ is a function of $E$, but the internal energy is typically not easy
to control. To put more energy into a system (or take some out), we may
give it some heat (or remove some). In other words, we often manipulate
$E$ by coupling the system to an appropriate thermal bath and so,
temperature (or $\beta$) becomes the "control" variable. In that case, we
can perform a Legendre transform of $S(E)$ and work with
$\mathcal{F}(\beta)$ instead. Since both $\{S, E\}$ and
$\{\mathcal{F}, \beta\}$ contain the same information about our system,
it makes sense to deal with the more convenient thermodynamic potential
when we change the control on our system from one variable to another.
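To see how the relation between $-\ln Z(\beta)$ and the Legendre transform of $S(E)$ emerges only in the thermodynamic limit, one can compare the two for a model in which the Laplace integral is known in closed form. The sketch below assumes $\Omega(E) = E^a$, so that $S(E) = a \ln E$ and $Z(\beta) = \Gamma(a+1)/\beta^{a+1}$; the exponent `a` stands in for the system size. The model and the function names are illustrative assumptions, not part of the text.

```python
import math

def free_energy_exact(a, beta):
    """-ln Z for the assumed model Omega(E) = E**a.

    Z(beta) = integral_0^inf dE exp(-beta*E) * E**a = Gamma(a+1) / beta**(a+1).
    """
    return -(math.lgamma(a + 1.0) - (a + 1.0) * math.log(beta))

def free_energy_legendre(a, beta):
    """Legendre transform of S(E) = a ln E: beta*E - S(E) at the E where dS/dE = beta."""
    e_star = a / beta  # solves a / E = beta
    return beta * e_star - a * math.log(e_star)

def gap_per_dof(a, beta):
    """Difference between the two free energies, divided by the size parameter a."""
    return abs(free_energy_exact(a, beta) - free_energy_legendre(a, beta)) / a

for a in (10.0, 1e3, 1e5):
    # The gap per degree of freedom shrinks as the system grows.
    print(a, gap_per_dof(a, beta=2.0))
```

The difference between the two expressions grows only logarithmically with `a`, while each of them grows at least linearly, which is why the Legendre relation becomes exact per degree of freedom for large systems.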

Since the independent variable in a thermodynamic potential is to be
regarded as a control (or a constraint) parameter, the "slope" associated
with this function (e.g., $dS/dE$, $d\mathcal{F}/d\beta$) carries
physically significant information, namely, the response of the system to
this control. The Legendre transform simply exchanges the role of the
variables associated with control and response. For the example discussed
above, it is physically easier to control $T$. It is also more familiar
to think of temperature (or $\beta$) as a control and the internal energy
as the response. Thus, the free energy $\mathcal{F}(\beta)$ is the more
appropriate potential, with $E = d\mathcal{F}/d\beta$ being the response.
In the transformed version, which is mathematically and conceptually
easier to grasp, $E$ is a constraint (conserved variable for an isolated
system) and $S(E)$ is the more appropriate potential. After we understand
the significance of its slope, $dS/dE$, we can identify the "response"
$\beta$ with a measure for temperature. There are many other examples of
response/control pairs to which the same kind of transformation may be
applied, such as particle number and chemical potential, polarizability
and electric field, magnetization and magnetic field, etc.



LEGENDRE TRANSFORM WITH MANY 
VARIABLES 

The thermodynamic potentials depend on many vari- 
ables other than just the total energy E. Each variable 
that can be independently controlled elicits a distinct re- 
sponse. As we construct Legendre transforms for each of 
these control/response variable pairs, we generate a new 
potential. The result is a plethora of thermodynamic 
functions. We again emphasize that all these thermo- 
dynamic potentials carry the same information, but en- 
coded in different ways. We begin this section by dis- 
cussing briefly the mathematical structure of the multi- 
variable Legendre transform and then apply it to ther- 
modynamics and statistical mechanics. 

Consider the multivariate function $F(\vec{x})$, where $\vec{x}$ stands
for $M$ independent variables: $x_1, \ldots, x_M$. For convenience,
suppose $F$ is smooth and convex over all of this $M$-dimensional space.
At every point $\vec{x}$, there will be $M$ slopes:

$$s_m \equiv \frac{\partial F}{\partial x_m} \equiv \partial_m F \qquad (45)$$

and $M(M+1)/2$ second derivatives, $\partial_m \partial_\ell F$, which
can be regarded as a symmetric matrix. The convexity restriction requires
that all of the eigenvalues of this matrix are positive (or negative). In
the context of thermodynamics, convexity is the condition for stability
in equilibrium systems. [10] A standard corollary is that the relation
between $\{x_m\}$ and $\{s_m\}$ is 1 to 1, so that we can replace any one
of the $x_m$'s by the corresponding $s_m$ through a Legendre transform.

Because we can transform any number of the $x$'s, we may consider (up to)
$2^M$ functions. For example, if we restrict ourselves to $(E, V)$ - the
standard variables for the microcanonical ensemble of the ideal gas -
there are four thermodynamic functions: the entropy, the enthalpy, and
the Gibbs and Helmholtz free energies. One way to picture the relation
between so many functions is to put them at the corners of an
$M$-dimensional hypercube. Each axis in this space is associated with a
particular variable pair $(x_m, s_m)$. Going from one corner to an
adjacent corner along a particular edge corresponds to carrying out the
Legendre transform for that pair. For the $M = 2$ example of
$(x_1, x_2) \to (E, V)$, the hypercube reduces to a square, which is
related, but not identical, to the square that appears in some texts.
[HHH Thanks to the commutativity of partial derivatives, going from any
corner to any other corner is a path independent process, so that the
function associated with each vertex is unique. For example, if we
exchange $(x_\ell, x_m)$ for $(s_\ell, s_m)$, the Legendre transform
relations would be the simple generalization of the single-variable
relation $F(x) + G(s) = sx$:

$$F(x_1, \ldots, x_\ell, \ldots, x_m, \ldots, x_M) + G(x_1, \ldots, s_\ell, \ldots, s_m, \ldots, x_M) = s_\ell x_\ell + s_m x_m \qquad (46)$$



with [12] $\partial_\ell G = x_\ell$, $\partial_m G = x_m$,
$\partial_\ell F = s_\ell$, and $\partial_m F = s_m$. We should have
given this $G$ some special notation to denote that its variables are all
$\{x\}$ except for the two that are $\{s\}$. A possibility is
$G_{\ell,m}$, but for simplicity, we do not pursue this issue further.
One special Legendre transform is noteworthy - the one in which all
variables are $\{s\}$. Located at the corner of the hypercube
diametrically opposite to $F$, this function will be denoted by $H$. In
this case, the Legendre transform relation simplifies to

$$H(\vec{s}) + F(\vec{x}) = \vec{s} \cdot \vec{x} \qquad (47)$$



Generalizations for higher derivatives are straightforward. For example,
Eq. (14) becomes

$$\sum_m (\partial_\ell \partial_m H)(\partial_m \partial_n F) = \delta_{\ell n}, \qquad (48)$$

where $\delta$ is the unit matrix. The convexity of $F$ guarantees that
the inverse of $\partial_m \partial_n F$ exists.
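Both the fully transformed relation of Eq. (47) and the matrix identity of Eq. (48) can be verified on a simple convex function. The sketch below assumes a quadratic form $F(\vec{x}) = \frac{1}{2} \sum_{ij} A_{ij} x_i x_j$ with a hypothetical positive-definite $2 \times 2$ matrix `A`; the matrix entries, the test point, and the helper names are all illustrative choices, not taken from the text.

```python
# Hypothetical positive-definite Hessian of F (an assumption for illustration).
A = ((2.0, 0.5), (0.5, 1.0))

def F(x):
    """Convex quadratic F(x) = (1/2) * sum_ij A_ij x_i x_j."""
    return 0.5 * sum(A[i][j] * x[i] * x[j] for i in range(2) for j in range(2))

def slopes(x):
    """s_m = dF/dx_m = sum_j A_mj x_j."""
    return tuple(sum(A[i][j] * x[j] for j in range(2)) for i in range(2))

def x_of_s(s):
    """Invert the 1-to-1 map s = A x using the explicit 2x2 inverse."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] * s[0] - A[0][1] * s[1]) / det,
            (-A[1][0] * s[0] + A[0][0] * s[1]) / det)

def H(s):
    """Fully transformed potential: H(s) = s.x - F(x), evaluated at x = x(s)."""
    x = x_of_s(s)
    return s[0] * x[0] + s[1] * x[1] - F(x)

def hessian(f, p, h=1e-3):
    """Matrix of second derivatives of f at p by central finite differences."""
    n = len(p)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def at(si, sj):
                q = list(p)
                q[i] += si * h
                q[j] += sj * h
                return f(tuple(q))
            out[i][j] = (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4 * h * h)
    return out

x = (0.7, -1.2)
s = slopes(x)
# Eq. (47): H(s) + F(x) should equal s . x
lhs = H(s) + F(x)
rhs = s[0] * x[0] + s[1] * x[1]
# Eq. (48): (Hessian of H) times (Hessian of F) should be the unit matrix
hess_H = hessian(H, s)
product = [[sum(hess_H[i][k] * A[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(lhs, rhs, product)
```

A quadratic form is the simplest case because the map between $\vec{x}$ and $\vec{s}$ is linear; for a general convex $F$ the inversion `x_of_s` would have to be done numerically.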

Let us apply these considerations to the thermodynamics of a gas. We
begin with the microcanonical partition function $\Omega(E, V)$ and
consider the mapping

$$F(x_1, x_2) \to S(E, V) = \ln \Omega \qquad (49)$$

with $x_1 \to E$, $x_2 \to V$, $s_1 \to \beta$, $s_2 \to \eta$. The last
of these is related to the pressure $P$, an issue we will comment on
later. The Legendre transform with respect to $x_1$ leads to the
Helmholtz free energy. Our symmetric and dimensionless version of
$F = E - TS$ is the same as Eq. (24):
$\mathcal{F}(\beta, V) + S(E, V) = \beta E$, with $V$ playing the role of
a "spectator." Thus, to be precise, we now write Eq. (41) with a partial
derivative:

$$\beta = \frac{\partial S}{\partial E} \qquad (50)$$



For the second Legendre transform, with respect to $x_2 \to V$, we define
[H

$$\eta \equiv \frac{\partial S}{\partial V} \qquad (51)$$

and arrive at

$$\mathcal{G}(\beta, \eta) + S(E, V) = \beta E + \eta V \qquad (52)$$

Here, $\mathcal{G} = \beta G(T, P)$ is the dimensionless Gibbs free
energy. Meanwhile, the relationship between $\eta$ and the traditional
definition of pressure, $P \equiv -\partial E/\partial V|_S$, is
$\eta = \beta P$. To show this will take us further afield, into the
first law of thermodynamics and the notion of heat transfer. The
interested reader may consult a standard text, such as Ref. [13].

Returning to Eq. (52), we move $S$ and divide both sides by $\beta$ to
arrive at its more common form: $G = E - TS + PV$. The seemingly
mysterious signs of the last two terms on the right are, from our
perspective, due to the placing of $S$ and the use of $T$ instead of
$\beta$. By contrast, every term comes with a positive sign in Eq. (52),
with all the potentials on the left and all the conjugate variables on
the right. Note that there are just two variables in this example, so
that $\mathcal{G}$ plays the role of $H$ in Eq. (47), of which Eq. (52)
is the explicit form here.
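As a concrete check of the two-variable transform, the sketch below uses an assumed ideal-gas-like entropy, $S(E, V) = N[\frac{3}{2} \ln E + \ln V]$, with additive constants dropped and $k_B = 1$. The particle number, the sample state, and the function names are hypothetical. It builds the dimensionless Gibbs potential from Eq. (52) and confirms by finite differences that its slopes return $E$ and $V$, and that $\eta = \beta P$ reproduces $PV = NT$.

```python
import math

N = 100.0  # hypothetical particle number

def entropy(E, V):
    """Assumed dimensionless entropy of an ideal gas, additive constants dropped."""
    return N * (1.5 * math.log(E) + math.log(V))

def beta_of(E):
    """beta = dS/dE at fixed V."""
    return 1.5 * N / E

def eta_of(V):
    """eta = dS/dV at fixed E."""
    return N / V

def gibbs(beta, eta):
    """Dimensionless Gibbs potential from Eq. (52): beta*E + eta*V - S(E, V)."""
    E = 1.5 * N / beta  # invert beta = 3N / (2E)
    V = N / eta         # invert eta = N / V
    return beta * E + eta * V - entropy(E, V)

def derivative(f, x, h=1e-6):
    """Central finite-difference derivative of a one-variable function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

E, V = 250.0, 40.0
beta, eta = beta_of(E), eta_of(V)

# The slopes of the transformed potential should return the original variables.
E_back = derivative(lambda b: gibbs(b, eta), beta)
V_back = derivative(lambda e: gibbs(beta, e), eta)

# With eta = beta * P and T = 1 / beta, the ideal gas law P V = N T should follow.
P, T = eta / beta, 1.0 / beta
print(E_back, V_back, P * V, N * T)
```

The roles of control and response are exchanged exactly as in the single-variable case: $(E, V)$ are recovered as the slopes of the transformed potential with respect to $(\beta, \eta)$.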

Lastly, we turn to enthalpy, which is laden with extra complications. For
various reasons, $S$ (instead of $E$) is chosen to be the independent
variable for arriving at the enthalpy. As a result, instead of $\beta$,
the natural conjugate variable is $T$ ($= dE/dS$). Regarding $S$ as a
control variable with which to access $E$ is conceptually difficult.
However, it is common to think of transferring heat so that $T\,dS$
appears as the means of control. If we take the Legendre transform of
$E(S)$ in the standard fashion, we would arrive at $TS - E$, which is the
Helmholtz free energy except for a sign. The disadvantage is clear, but
there are advantages to this approach. In particular, by starting with
$E(S, V)$, we naturally arrive at the ordinary pressure, $-P$, as the
conjugate to $V$ (instead of $\eta$). Note the extra minus sign here. The
Legendre transform with respect to $V$ of $E(S, V)$ gives $(-P)V - E$,
the (negative of) enthalpy $H = E + PV$. If we allow logic to overcome
tradition, we would have defined the last potential as
$\mathcal{H}(E, \eta)$ (not to be confused with the Hamiltonian
$\mathcal{H}$!) through the Legendre transform

$$\mathcal{H}(E, \eta) + S(E, V) = \eta V \qquad (53)$$

in which the first variable, $E$, plays the role of a spectator. But the
beauty of pure reason does not always prevail and we must often abide by
the results of our historical paths.

CONCLUDING REMARKS

There are many interesting aspects of the Legendre 
transform we have not discussed. Covering all aspects 
would be more appropriate for a textbook than a journal 
article. Here, let us conclude by touching on just two 
important generalizations - the Legendre transform of (a) 
non-convex functions and (b) functions defined on spaces 
with non-trivial topology, such as the angle on a circle - 
and providing references for further reading. 

If a function is non-convex, the Legendre transform 
becomes multi-valued. If we delete all but the principal 
branch, the Legendre transform develops discontinuous 
first derivatives. If we perform another transformation, 
the result would be the convex hull of the original. This 
topic is intimately related to the Maxwell construction 
and the co-existence of phases (for example, liquid and 
vapor). Although most texts on thermodynamics and 
statistical mechanics discuss the Maxwell construction, 
few demonstrate its relation to the Legendre transform 
of non-convex functions. The interested reader may find 
a good example of a convexified (free energy) function in 

A second generalization concerns variables whose domains have a
non-trivial topology, the simplest being functions defined on a circle or
the surface of a sphere. The angles are the most natural variables for a
sphere, but we must be mindful of the periodic nature of
$\phi \in (0, 2\pi]$ and the coordinate singularities at the poles
$\theta = 0, \pi$. An example is the shape of a crystal in equilibrium
with its liquid (for example, $^4$He crystals in coexistence with the
superfluid [15]) or vapor (for example, gold crystals [HI). Typical
crystal shapes are not spherical and can be described by a non-trivial
function $R(\theta, \phi)$, which specifies the distance from the center
of mass to a point on the crystal surface labeled by $(\theta, \phi)$.
The tangent plane at that point can be associated with the direction of
its normal and labeled by $(\vartheta, \varphi)$. The relation between
these and the derivatives $\partial_\theta R$ and $\partial_\phi R$
exists, but is not simple. From these derivatives a (generalized)
Legendre transform of $R$ can be constructed:
$\sigma(\vartheta, \varphi)$. The function $\sigma$ is also a significant
physical quantity: it is the free energy per unit area (surface tension)
associated with a planar interface, with normal $(\vartheta, \varphi)$,
between the crystalline and the isotropic phases of the material. A bonus
is that, unlike typical thermodynamic potentials such as the entropy and
free energies, the "potential" $R(\theta, \phi)$ is not just an abstract
concept; it can be visualized, being displayed explicitly as a shape in
three dimensions. Further details of this intriguing connection may be
found in Refs. E3.

Finally, we should point the readers to horizons far beyond those
discussed here. Since our purpose is to reach students and instructors in
upper undergraduate and core graduate courses, we limited our
considerations above to cases with two (or finite $M$) variables. Beyond
this level, it is possible to study the Legendre transform with an
infinite number of variables. Probably the most well known example in
physics comes from both quantum field theory [3] and statistical field
theory. Associated with each quantum field $\phi(\vec{r}, t)$ is a
"source field" $J(\vec{r}, t)$, in much the same way that a fluctuating
local magnetization, $m(\vec{r})$, can be "created" by an inhomogeneous
magnetic field $B(\vec{r})$. In the latter system, the fluctuations of
$m$ are thermal, rather than quantum, in nature. Now, the source field
can be regarded as a control variable for each $\vec{r}, t$ (or just
$\vec{r}$). Thus, there are an infinite number of variables, as well as
responses, involved. Corresponding to a given $J(\vec{r}, t)$ or
$B(\vec{r})$, we can compute, in principle, the "vacuum energy"
$U[J(\vec{r}, t)]$ or the free energy $\mathcal{F}[B(\vec{r})]$. These
carry information on the quantities of interest: connected Schwinger
functions (expectation values of products of $\phi$'s) or correlation
functions (averages of products of $m$'s). More useful than $U$ is its
Legendre transform, $\Gamma$. Known as the effective action, $\Gamma$
displays the essential information more conveniently in terms of one
particle irreducible (1PI) Schwinger functions or vertex functions. For
the effective action of a quantum field, there is a particularly
appealing systematic expansion: in powers of $\hbar$. The zeroth order
term is just the classical action. Similarly, for the Legendre transform
of $\mathcal{F}$ there is a systematic expansion in powers of $T$ or
$\beta^{-1}$. Not surprisingly, the zeroth order term here is just the
energy associated with $m(\vec{r})$, i.e., the Hamiltonian
$\mathcal{H}[m(\vec{r})]$ which enters the Boltzmann factor
$\exp\{-\beta \mathcal{H}\}$. Our hope is that these comments will help
some students who are struggling with field theory or perhaps further
motivate those who are enthusiastically waiting to delve into the
subject.